Optimal Query Complexity


Efficient Pure Exploration in Adaptive Round model

Neural Information Processing Systems

In the adaptive setting, many multi-armed bandit applications allow the learner to adaptively draw samples and adjust the sampling strategy in rounds. In many real applications, not only the query complexity but also the round complexity needs to be optimized. In this paper, we study both the PAC and the exact top-$k$ arm identification problems and design efficient algorithms that account for both round complexity and query complexity. For the PAC problem, we achieve optimal query complexity and use only $O(\log_{\frac{k}{\delta}}^*(n))$ rounds, which matches the lower bound on round complexity, while most existing works need $\Theta(\log \frac{n}{k})$ rounds. For exact top-$k$ arm identification, we improve the round complexity factor from $\log n$ to $\log_{\frac{1}{\delta}}^*(n)$ and achieve near-optimal query complexity. In experiments, our algorithms use far fewer rounds and outperform the state of the art by orders of magnitude with respect to query cost.
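
The round/query tradeoff described above can be illustrated with a simple round-based elimination scheme: within a round every surviving arm is pulled the same number of times (so the round's queries can run in parallel), then the weaker half is discarded. This is only a sketch of the round model, not the paper's algorithm (which achieves $O(\log_{\frac{k}{\delta}}^*(n))$ rounds); the `pull` oracle, the per-round budget, and the halving schedule are illustrative assumptions.

```python
import random

def topk_in_rounds(pull, n, k, pulls_per_round=100, seed=0):
    """Round-based elimination for top-k identification: in each round every
    surviving arm is pulled the same number of times (queries within a round
    can be issued in parallel), then the weaker half is discarded."""
    rng = random.Random(seed)
    survivors = list(range(n))
    rounds = queries = 0
    while len(survivors) > k:
        rounds += 1
        means = {}
        for arm in survivors:
            rewards = [pull(arm, rng) for _ in range(pulls_per_round)]
            queries += pulls_per_round
            means[arm] = sum(rewards) / pulls_per_round
        survivors.sort(key=lambda a: means[a], reverse=True)
        survivors = survivors[:max(k, len(survivors) // 2)]
    return set(survivors), rounds, queries
```

With $n$ arms this halving schedule uses $\Theta(\log \frac{n}{k})$ rounds, which is exactly the round complexity the paper improves upon.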


Optimal Query Complexity of Secure Stochastic Convex Optimization

Neural Information Processing Systems

We study the \emph{secure} stochastic convex optimization problem: a learner aims to learn the optimal point of a convex function by sequentially querying a (stochastic) gradient oracle, while an adversary aims to free-ride and infer the learning outcome by observing the learner's queries. The adversary observes only the points of the queries, not the feedback from the oracle. The goal of the learner is to optimize accuracy, i.e., obtain an accurate estimate of the optimal point, while securing her privacy, i.e., making it difficult for the adversary to infer the optimal point. We formally quantify this tradeoff between the learner's accuracy and privacy and characterize lower and upper bounds on the learner's query complexity as a function of the desired levels of accuracy and privacy. For the lower bounds, we provide a general template based on information-theoretic analysis and then tailor it to several families of problems, including stochastic convex optimization and (noisy) binary search. We also present a generic secure learning protocol that achieves the matching upper bound up to logarithmic factors.
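
The accuracy/privacy tension can be made concrete with a toy protocol for one of the problem families mentioned above, (noiseless) binary search: the learner pads each genuine bisection query with uniformly random decoy queries, so the adversary, who sees only the queried points, cannot single out the genuine trajectory. This is purely illustrative and is not the paper's protocol; the decoy budget and the noiseless oracle are assumptions of the sketch.

```python
import random

def secure_binary_search(f, steps=30, decoys_per_step=3, seed=0):
    """Binary search for the zero of an increasing f on [0, 1], padded with
    uniform decoy queries. The adversary observes only the queried points,
    so the genuine bisection trajectory is obscured among the decoys."""
    rng = random.Random(seed)
    lo, hi = 0.0, 1.0
    observed = []                   # the query points visible to the adversary
    for _ in range(steps):
        mid = (lo + hi) / 2
        batch = [mid] + [rng.random() for _ in range(decoys_per_step)]
        rng.shuffle(batch)          # adversary cannot tell genuine from decoy
        observed.extend(batch)
        if f(mid) > 0:              # learner uses only the genuine feedback
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2, observed
```

Accuracy still improves geometrically in the number of rounds, but the query complexity is inflated by the decoy factor, which is the kind of accuracy-versus-privacy cost the paper quantifies.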


Review for NeurIPS paper: Optimal Query Complexity of Secure Stochastic Convex Optimization

Neural Information Processing Systems

In terms of the presentation, the paper constantly switches terminology, referring both to "secure" and to "private." Given that differential privacy has by now been established as a notion of formal privacy, I believe it is better to refer to this problem as "secure" optimization. For example, on page 2, line 49, classic bounds are referred to as "non-private"; I think it would be better to refer exclusively to "secure" in these definitions. However, they are clearly vacuous for the classical "non-secure" case.


Review for NeurIPS paper: Optimal Query Complexity of Secure Stochastic Convex Optimization

Neural Information Processing Systems

This paper was carefully reviewed and discussed by our reviewer panel. The consensus was that this is nice work, that the rebuttal had some sway, and that the paper can be published at NeurIPS this year. But please do take into account the detailed comments of the reviewers when putting together your camera-ready version.


Optimal Query Complexities for Dynamic Trace Estimation

Neural Information Processing Systems

We consider the problem of minimizing the number of matrix-vector queries needed for accurate trace estimation in the dynamic setting where our underlying matrix is changing slowly, such as during an optimization process. Specifically, for any $m$ matrices $\mathbf{A}_1,...,\mathbf{A}_m$ with consecutive differences bounded in Schatten-$1$ norm by $\alpha$, we provide a novel binary tree summation procedure that simultaneously estimates all $m$ traces up to $\epsilon$ error with $\delta$ failure probability with an optimal query complexity of $\widetilde{O}(m \alpha\sqrt{\log(1/\delta)}/\epsilon + m\log(1/\delta))$, improving the dependence on both $\alpha$ and $\delta$ from Dharangutte and Musco (NeurIPS, 2021). By using novel reductions to communication complexity and information-theoretic analyses of Gaussian matrices, we provide matching lower bounds for static and dynamic trace estimation in all relevant parameters, including the failure probability. Our lower bounds (1) give the first tight bounds for Hutchinson's estimator in the matrix-vector product model with Frobenius norm error {\it even in the static setting}, and (2) are the first unconditional lower bounds for dynamic trace estimation, resolving open questions of prior work.
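
The starting point for this line of work, Hutchinson's estimator, and the dynamic idea of spending cheap queries on the small-norm differences $\mathbf{A}_t - \mathbf{A}_{t-1}$ can be sketched as follows. This is a minimal baseline illustration, not the paper's binary tree summation procedure; the function names and query budgets are illustrative assumptions.

```python
import numpy as np

def hutchinson_trace(A, num_queries, rng):
    """Estimate tr(A) from matrix-vector products with random sign vectors."""
    n = A.shape[0]
    total = 0.0
    for _ in range(num_queries):
        x = rng.choice([-1.0, 1.0], size=n)   # Rademacher query vector
        total += x @ (A @ x)                  # one matrix-vector query
    return total / num_queries

def dynamic_traces(matrices, base_queries, diff_queries, seed=0):
    """Estimate tr(A_1), ..., tr(A_m) by estimating tr(A_1) once and then
    the differences tr(A_t - A_{t-1}), which are cheap to estimate because
    the differences are small in norm."""
    rng = np.random.default_rng(seed)
    running = hutchinson_trace(matrices[0], base_queries, rng)
    estimates = [running]
    for prev, curr in zip(matrices, matrices[1:]):
        running += hutchinson_trace(curr - prev, diff_queries, rng)
        estimates.append(running)
    return estimates
```

The catch with this naive difference scheme is that the errors of the $m$ difference estimates accumulate; the binary tree summation procedure in the paper is precisely a way to control that accumulation with an optimal number of queries.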


A Complete Characterization of Learnability for Stochastic Noisy Bandits

Hanneke, Steve, Wang, Kun

arXiv.org Machine Learning

We study the stochastic noisy bandit problem with an unknown reward function $f^*$ in a known function class $\mathcal{F}$. Formally, a model $M$ maps arms $\pi$ to a probability distribution $M(\pi)$ of reward. A model class $\mathcal{M}$ is a collection of models. For each model $M$, define its mean reward function $f^M(\pi)=\mathbb{E}_{r \sim M(\pi)}[r]$. In the bandit learning problem, we proceed in rounds, pulling one arm $\pi$ each round and observing a reward sampled from $M(\pi)$. With knowledge of $\mathcal{M}$, supposing that the true model $M\in \mathcal{M}$, the objective is to identify an arm $\hat{\pi}$ of near-maximal mean reward $f^M(\hat{\pi})$ with high probability in a bounded number of rounds. If this is possible, then the model class is said to be learnable. Importantly, a result of \cite{hanneke2023bandit} shows there exist model classes for which learnability is undecidable. However, the model class they consider features deterministic rewards, and they raise the question of whether learnability is decidable for classes containing sufficiently noisy models. For the first time, we answer this question in the positive by giving a complete characterization of learnability for model classes with arbitrary noise. In addition, we describe the full spectrum of possible optimal query complexities. Further, we prove that adaptivity is sometimes necessary to achieve the optimal query complexity. Lastly, we revisit an important complexity measure for interactive decision making, the Decision-Estimation Coefficient \citep{foster2021statistical,foster2023tight}, and propose a new variant of the DEC which also characterizes learnability in this setting.
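
The learning objective above can be illustrated with a toy non-adaptive procedure for a finite model class: estimate every arm's mean, discard models whose mean reward functions disagree with the estimates, and recommend the best arm of a surviving model. This is only a sketch of the setting (the representation of models as mean vectors, the pull budget, and the tolerance are assumptions), and it is deliberately non-adaptive, whereas the paper shows adaptivity is sometimes necessary for optimal query complexity.

```python
import random

def identify_good_arm(model_class, sample, n_arms, pulls=400, tol=0.15, seed=0):
    """Toy elimination over a finite model class: estimate each arm's mean
    reward, discard models whose predicted means disagree with the estimates,
    and recommend the best arm of a surviving model."""
    rng = random.Random(seed)
    est = [sum(sample(a, rng) for _ in range(pulls)) / pulls
           for a in range(n_arms)]
    surviving = [f for f in model_class
                 if all(abs(f[a] - est[a]) <= tol for a in range(n_arms))]
    assert surviving, "the true model survives with high probability"
    f_hat = surviving[0]                       # any consistent model
    return max(range(n_arms), key=lambda a: f_hat[a])
```

For infinite or adversarially structured model classes this uniform-sampling approach breaks down, which is exactly where the characterization of learnability in the paper comes in.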


Active Learning Ranking from Pairwise Preferences with Almost Optimal Query Complexity

Neural Information Processing Systems

Given a set $V$ of $n$ elements we wish to linearly order them using pairwise preference labels, which may be non-transitive (due to irrationality or arbitrary noise). The goal is to linearly order the elements while disagreeing with as few pairwise preference labels as possible. Our performance is measured by two parameters: the number of disagreements (loss) and the query complexity (number of pairwise preference labels). Our algorithm adaptively queries at most $O(n\,\mathrm{poly}(\log n, \epsilon^{-1}))$ preference labels for a regret of $\epsilon$ times the optimal loss. This is strictly better, and often significantly better, than what non-adaptive sampling could achieve.
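
A natural baseline for ranking from pairwise preference queries is randomized quicksort driven by a preference oracle, which uses $O(n \log n)$ labels in expectation on consistent inputs. This sketch is only that baseline, not the paper's algorithm (which must also handle non-transitive labels and bound the regret); the `prefers` oracle interface is an illustrative assumption.

```python
import random

def rank_by_preferences(items, prefers, seed=0):
    """Randomized quicksort driven by pairwise preference queries.
    prefers(a, b) returns True if a should precede b; the counter records
    how many preference labels were requested."""
    rng = random.Random(seed)
    queries = [0]

    def sort(xs):
        if len(xs) <= 1:
            return xs
        i = rng.randrange(len(xs))
        pivot = xs[i]
        rest = xs[:i] + xs[i + 1:]
        before, after = [], []
        for x in rest:
            queries[0] += 1              # one pairwise preference query
            (before if prefers(x, pivot) else after).append(x)
        return sort(before) + [pivot] + sort(after)

    return sort(list(items)), queries[0]
```

When the labels are non-transitive, a comparison sort like this can incur large loss; handling that regime with near-optimal query complexity is the contribution of the paper.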